With the development of natural language processing (NLP) techniques, automatic diagnosis of eye diseases using ophthalmology electronic medical records (OEMR) has become possible. The task aims to evaluate the condition of each of a patient's eyes, and we formulate it as a particular multi-label classification task in this paper. Although a few related studies exist for other diseases, automatic diagnosis of eye diseases exhibits unique characteristics. First, descriptions of both eyes are mixed together in OEMR documents, combining free text with templated asymptomatic descriptions, which makes the information sparse and cluttered. Second, OEMR documents contain multiple descriptive sections and are long. Third, it is critical that the disease diagnosis model be explainable. To overcome these challenges, we present an effective automatic eye disease diagnosis framework, NEEDED. The framework integrates a preprocessing module to improve the density and quality of information. We then design a hierarchical transformer structure to learn contextualized representations of each sentence in an OEMR document. For diagnosis, we propose an attention-based predictor that enables traceable diagnosis by extracting disease-specific information. Experiments on a real-world dataset and comparisons with several baseline models demonstrate the advantages and explainability of our framework.
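To make the traceable-diagnosis idea concrete, below is a minimal sketch of an attention-based multi-label predictor in this spirit; the class name, dimensions, and pooling scheme are our assumptions rather than NEEDED's actual implementation. Each disease label owns a learnable query that attends over sentence representations, so the attention weights indicate which sentences drove each prediction.

```python
# Minimal sketch (assumed design, not the authors' code): per-disease attention
# over sentence representations yields both logits and traceable weights.
import torch
import torch.nn as nn

class AttentionPredictor(nn.Module):
    def __init__(self, hidden_dim: int, num_diseases: int):
        super().__init__()
        # One learnable query vector per disease label.
        self.queries = nn.Parameter(torch.randn(num_diseases, hidden_dim))
        self.out = nn.Linear(hidden_dim, 1)

    def forward(self, sent_reprs: torch.Tensor, mask: torch.Tensor):
        # sent_reprs: (batch, num_sents, hidden); mask: (batch, num_sents) bool
        scores = torch.einsum("bsh,dh->bds", sent_reprs, self.queries)
        scores = scores.masked_fill(~mask.unsqueeze(1), float("-inf"))
        attn = scores.softmax(dim=-1)                 # traceable per-disease weights
        disease_repr = torch.einsum("bds,bsh->bdh", attn, sent_reprs)
        logits = self.out(disease_repr).squeeze(-1)   # (batch, num_diseases)
        return logits, attn

sents = torch.randn(2, 10, 256)
mask = torch.ones(2, 10, dtype=torch.bool)
logits, attn = AttentionPredictor(256, 5)(sents, mask)
print(logits.shape, attn.shape)  # torch.Size([2, 5]) torch.Size([2, 5, 10])
```

Inspecting `attn` for a predicted disease points back to the sentences that contributed most, which is what makes such a predictor explainable.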
Feature transformation for AI is an essential task for boosting the effectiveness and interpretability of machine learning (ML). It aims to transform the original data to identify an optimal feature space that enhances the performance of a downstream ML model. Existing studies either combine preprocessing, feature selection, and generation techniques to transform data empirically, or automate feature transformation through machine intelligence, such as reinforcement learning. However, they suffer from: 1) a high-dimensional, non-discriminative feature space; 2) an inability to represent complex situational states; and 3) inefficiency in integrating local and global feature information. To fill this research gap, we formulate the feature transformation task as an iterative, nested process of feature generation and selection, where feature generation creates and adds new features based on the original features, and feature selection removes redundant features to control the size of the feature space. Finally, we present extensive experiments and case studies that show a 24.7% improvement in F1 score over state-of-the-art baselines, as well as robustness on high-dimensional data.
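As an illustration of the nested generate-then-select loop, the following toy sketch grows the feature space with pairwise products and then prunes it by mutual information with the label; the operator set and selection criterion are illustrative assumptions, not the paper's exact procedure.

```python
# Toy sketch of iterative, nested feature generation and selection
# (assumed operators and scoring; not the paper's method).
import numpy as np
from itertools import combinations
from sklearn.feature_selection import mutual_info_classif

def transform(X: np.ndarray, y: np.ndarray, rounds: int = 3, max_feats: int = 20):
    for _ in range(rounds):
        # Generation: add pairwise products of existing features.
        new_cols = [X[:, i] * X[:, j] for i, j in combinations(range(X.shape[1]), 2)]
        X = np.column_stack([X] + new_cols)
        # Selection: keep the features most informative about the label,
        # which controls the size of the feature space.
        scores = mutual_info_classif(X, y, random_state=0)
        keep = np.argsort(scores)[::-1][:max_feats]
        X = X[:, keep]
    return X

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] * X[:, 1] > 0).astype(int)  # label depends on a feature interaction
print(transform(X, y).shape)             # (200, 20)
```

The alternation matters: generation alone lets the space explode, while the selection step keeps it compact and discriminative.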
Deep neural networks have strong capabilities for memorizing their underlying training data, which can be a serious privacy concern. An effective solution to this problem is to train models with differential privacy (DP), which provides rigorous privacy guarantees by injecting random noise into the gradients. This paper focuses on the scenario where sensitive data are distributed among multiple participants, who jointly train a model through federated learning (FL), using both secure multiparty computation (MPC) to ensure the confidentiality of each gradient update, and differential privacy to avoid data leakage in the resulting model. A major challenge in this setting is that common mechanisms for enforcing DP in deep learning, which inject real-valued noise, are fundamentally incompatible with MPC, which exchanges finite-field integers among the participants. Consequently, most existing DP mechanisms require rather high noise levels, leading to poor model utility. Motivated by this, we propose the Skellam mixture mechanism (SMM), an approach for enforcing DP on models built via FL. Compared to existing methods, SMM eliminates the assumption that the input gradients must be integer-valued and thus reduces the amount of noise injected to preserve DP. Further, SMM allows tight privacy accounting due to the nice composition and sub-sampling properties of the Skellam distribution, which are key to accurate deep learning with DP. The theoretical analysis of SMM is highly non-trivial, especially considering (i) the complicated mathematics of differentially private deep learning in general and (ii) the fact that the mixture of two Skellam distributions is rather complex and, to our knowledge, has not been studied in the DP literature. Extensive experiments in various practical settings demonstrate that SMM consistently and significantly outperforms existing solutions in terms of the utility of the resulting model.
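To show why Skellam noise suits MPC pipelines, here is a minimal sketch of integer-valued gradient perturbation: a Skellam(mu, mu) sample is the difference of two independent Poisson(mu) draws, so the noisy gradient stays in the integer domain that MPC protocols exchange. The clipping, quantization scale, and noise level are illustrative, and SMM's actual mixture construction and privacy accounting are considerably more involved.

```python
# Minimal sketch of integer Skellam noise for DP gradient perturbation
# (illustrative parameters; not SMM's full mechanism or accounting).
import numpy as np

rng = np.random.default_rng(0)

def skellam_noise(mu: float, size) -> np.ndarray:
    # Skellam(mu, mu) = Poisson(mu) - Poisson(mu); integer-valued by construction.
    return rng.poisson(mu, size) - rng.poisson(mu, size)

def privatize(grad: np.ndarray, clip: float, scale: float, mu: float) -> np.ndarray:
    # Clip to bound sensitivity, quantize to the finite-field integer domain,
    # then add integer Skellam noise before secure aggregation.
    norm = np.linalg.norm(grad)
    grad = grad * min(1.0, clip / max(norm, 1e-12))
    quantized = np.round(grad * scale).astype(np.int64)
    return quantized + skellam_noise(mu, quantized.shape)

g = rng.normal(size=1000)
noisy = privatize(g, clip=1.0, scale=2**10, mu=5e4)
print(noisy.dtype, noisy[:5])
```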
Spiking neural networks (SNNs) are promising brain-inspired, energy-efficient models. Recent progress in training methods has enabled successful deep SNNs on large-scale tasks with low latency. In particular, backpropagation through time (BPTT) with surrogate gradients (SG) is widely used to achieve high performance with a very small number of time steps. However, this comes at the cost of large memory consumption during training, a lack of theoretical clarity for optimization, and inconsistency with the online nature of biological learning and of learning rules on neuromorphic hardware. Other works connect the spike representations of SNNs with equivalent artificial neural network formulations and train SNNs using gradients from the equivalent mappings to ensure descent directions, but they fail to achieve low latency and are likewise not online. In this work, we propose online training through time (OTTT) for SNNs, which is derived from BPTT and enables forward-in-time learning by tracking presynaptic activities and leveraging instantaneous losses and gradients. Meanwhile, we theoretically analyze and prove that the gradients of OTTT provide a descent direction for optimization similar to gradients based on spike representations, under both feedforward and recurrent conditions. OTTT requires only constant training memory, agnostic to the number of time steps, avoiding the significant memory costs of BPTT for GPU training. Furthermore, the update rule of OTTT takes the form of three-factor Hebbian learning, which could pave the way for online on-chip learning. With OTTT, the two mainstream supervised SNN training methods, BPTT with SG and spike-representation-based training, are connected for the first time, and in a biologically plausible form. Experiments on CIFAR-10, CIFAR-100, ImageNet, and CIFAR10-DVS demonstrate the superior performance of our method on large-scale static and neuromorphic datasets with a small number of time steps.
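The single-layer sketch below illustrates the forward-in-time flavor of such training: presynaptic spikes accumulate into a running trace, and each time step's instantaneous loss drives a local three-factor-style update, so no unrolled computation graph over time is stored. The leak constant, surrogate derivative, and loss are assumed forms for illustration, not the paper's exact rule.

```python
# Single-layer forward-in-time learning sketch in the OTTT spirit
# (assumed LIF dynamics, surrogate derivative, and MSE loss).
import torch

T, n_in, n_out = 8, 100, 10
torch.manual_seed(0)
W = 0.1 * torch.randn(n_out, n_in)
v = torch.zeros(n_out)        # membrane potentials
trace = torch.zeros(n_in)     # running presynaptic trace (constant memory)
lam, thresh, lr = 0.5, 1.0, 0.05

x = (torch.rand(T, n_in) < 0.2).float()   # input spike trains
target = torch.zeros(n_out); target[3] = 1.0

for t in range(T):
    trace = lam * trace + x[t]            # track presynaptic activity online
    v = lam * v + W @ x[t]                # leaky integration
    spike = (v >= thresh).float()
    v = v - spike * thresh                # soft reset after firing
    # Instantaneous loss at this step; a surrogate derivative stands in
    # for the non-differentiable spike function.
    err = spike - target                               # dL/d(spike) for MSE
    sg = 1.0 / (1.0 + torch.abs(v - thresh)) ** 2      # assumed surrogate form
    W -= lr * torch.outer(err * sg, trace)             # error x surrogate x trace
print(W.abs().mean())
```

The update touches only current-step quantities plus the trace, which is why memory stays constant in the number of time steps.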
Funding agencies rely heavily on topic matching between domain experts and research proposals to assign proposal reviewers. As proposals become increasingly interdisciplinary, it is challenging to profile the interdisciplinary nature of a proposal and, thereafter, to find expert reviewers with the appropriate expertise. An essential step in addressing this challenge is to accurately classify a proposal's interdisciplinary labels. Existing methodological and application-related literature, such as text classification and proposal classification, is insufficient to jointly address three key issues unique to interdisciplinary proposal data: 1) the hierarchical structure of a proposal's discipline labels, from coarse to fine granularity, e.g., from information science to AI to AI fundamentals; 2) the heterogeneous semantics of the various main text sections, which play different roles in a proposal; 3) the imbalance in the number of proposals between non-interdisciplinary and interdisciplinary research. Can we address all three issues simultaneously when understanding a proposal's interdisciplinary nature? To answer this question, we propose a hierarchical MixUp multi-label classification framework, which we call H-MixUp. H-MixUp leverages a transformer-based semantic information extractor and a GCN-based interdisciplinary knowledge extractor to address the first and second issues. H-MixUp develops a fused training method of word-level MixUp, word-level CutMix, manifold MixUp, and document-level MixUp to address the third issue.
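As one example from the mixup family used here, a document-level MixUp step can be sketched as follows; the interpolation site (pooled document embeddings) and the multi-hot label format are our assumptions.

```python
# Sketch of document-level MixUp on pooled proposal embeddings
# (assumed interpolation site; one of four variants fused in H-MixUp).
import numpy as np

def doc_mixup(doc_embs: np.ndarray, labels: np.ndarray, alpha: float = 0.2):
    """Interpolate whole-document embeddings and their multi-label vectors."""
    lam = np.random.beta(alpha, alpha)
    perm = np.random.permutation(len(doc_embs))
    mixed_x = lam * doc_embs + (1 - lam) * doc_embs[perm]
    mixed_y = lam * labels + (1 - lam) * labels[perm]
    return mixed_x, mixed_y

embs = np.random.randn(32, 768)                     # one embedding per proposal
ys = (np.random.rand(32, 10) > 0.8).astype(float)   # multi-hot discipline labels
mx, my = doc_mixup(embs, ys)
print(mx.shape, my.shape)
```

Because mixed samples interpolate between majority and minority classes, such augmentation helps counter the non-interdisciplinary/interdisciplinary imbalance noted above.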
Nowadays, chapter generation has become a practical technique for online videos. Chapter breakpoints enable users to quickly locate the parts they want and to obtain summarizing annotations. However, there are no public methods or datasets for this task. To facilitate research in this direction, we introduce a new dataset called Chapter-Gen, which consists of roughly 10k user-generated videos with annotated chapter information. Our data collection procedure is fast, scalable, and requires no additional manual annotation. On top of this dataset, we design an effective baseline tailored to the video chapter generation task. It captures two aspects of a video: visual dynamics and narration text. It disentangles local and global video features for localization and title generation, respectively. To parse long videos efficiently, a skip sliding window mechanism is designed to localize potential chapters, and a cross-attention multimodal fusion module is developed to aggregate local features for title generation. Our experiments show that the proposed framework achieves superior results over existing methods, indicating that method designs for similar tasks cannot be transferred directly, even after fine-tuning. Code and dataset are available at https://github.com/czt117/mvcg.
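A toy version of the skip sliding window idea might look like the sketch below; the window size, stride, and boundary scorer are illustrative stand-ins for the paper's design. Only the highest-scoring windows would receive the expensive captioning pass, which is what makes long videos tractable.

```python
# Toy sketch of skip-sliding-window chapter localization
# (assumed window size, stride, and scoring; see the repo for the real design).
import numpy as np

def candidate_windows(num_frames: int, window: int = 64, stride: int = 32):
    """Enumerate windows with a stride > 1, so a long video is parsed cheaply."""
    return [(s, min(s + window, num_frames)) for s in range(0, num_frames, stride)]

def localize(num_frames: int, boundary_scores: np.ndarray, top_k: int = 5):
    # Score each window by its mean boundary probability; keep the best ones.
    wins = candidate_windows(num_frames)
    scores = [boundary_scores[s:e].mean() for s, e in wins]
    order = np.argsort(scores)[::-1][:top_k]
    return [wins[i] for i in order]

bscores = np.random.rand(1000)   # stand-in for learned boundary probabilities
print(localize(1000, bscores))   # top candidate chapter spans (start, end)
```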
In this paper, we present our solution to the MuSe-Humor sub-challenge of the Multimodal Sentiment Analysis Challenge (MuSe) 2022. The goal of the MuSe-Humor sub-challenge is to detect humor, evaluated by AUC, from audiovisual recordings of press conferences of German Bundesliga football clubs, annotated for humor displayed by the coaches. For this sub-challenge, we first build a discriminative model with a transformer module and a BiLSTM module, and then propose a hybrid fusion strategy that uses the prediction results of each modality to improve the model's performance. Our experiments demonstrate the effectiveness of the proposed model and of the hybrid fusion strategy for multimodal fusion; the AUC of our model on the test set is 0.8972.
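One plausible reading of such a hybrid fusion strategy is a learned late fusion over per-modality predictions, sketched below on synthetic scores; the actual fusion rule in the paper may differ.

```python
# Sketch of learned late fusion over per-modality humor scores
# (synthetic data; assumed weighted-fusion variant, not the paper's exact rule).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_val = rng.integers(0, 2, 500)
# Stand-ins for each modality's predicted humor probabilities.
p_audio = np.clip(y_val * 0.6 + rng.normal(0.2, 0.25, 500), 0, 1)
p_video = np.clip(y_val * 0.5 + rng.normal(0.25, 0.30, 500), 0, 1)
p_text  = np.clip(y_val * 0.55 + rng.normal(0.22, 0.28, 500), 0, 1)

stacked = np.column_stack([p_audio, p_video, p_text])
fuser = LogisticRegression().fit(stacked, y_val)   # learns the fusion weights
p_fused = fuser.predict_proba(stacked)[:, 1]
for name, p in [("audio", p_audio), ("video", p_video), ("fused", p_fused)]:
    print(name, round(roc_auc_score(y_val, p), 3))  # fused AUC beats each modality
```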
The peer merit review of research proposals has been the major mechanism for deciding grant awards. However, research proposals have become increasingly interdisciplinary. It has been a longstanding challenge to assign interdisciplinary proposals to appropriate reviewers so that proposals are evaluated fairly. One of the critical steps in reviewer assignment is to generate accurate interdisciplinary topic labels for proposal-reviewer matching. Existing systems mainly collect topic labels manually supplied by principal investigators. However, such human-reported labels can be inaccurate, incomplete, labor-intensive, and time-consuming to collect. What role can AI play in developing a fair and precise proposal-reviewer assignment system? In this study, we collaborate with the National Science Foundation of China to address the task of automated interdisciplinary topic path detection. For this purpose, we develop a deep Hierarchical Interdisciplinary Research Proposal Classification Network (HIRPCN). Specifically, we first propose a hierarchical transformer to extract the textual semantic information of proposals. We then design an interdisciplinary graph and leverage GNNs to learn representations of each discipline in order to extract interdisciplinary knowledge. After extracting the semantic and interdisciplinary knowledge, we design a level-wise prediction component to fuse the two types of knowledge representations and detect interdisciplinary topic paths for each proposal. We conduct extensive experiments and expert evaluations on three real-world datasets to demonstrate the effectiveness of our proposed model.
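To make the level-wise idea concrete, here is a minimal sketch in which each level of the discipline hierarchy fuses the proposal's semantic vector with embeddings of the disciplines predicted at the previous level; the dimensions and fusion rule are our assumptions, not HIRPCN's exact design.

```python
# Sketch of a level-wise prediction head over a discipline hierarchy
# (assumed dimensions and fusion; not HIRPCN's actual architecture).
import torch
import torch.nn as nn

class LevelWisePredictor(nn.Module):
    def __init__(self, dim: int, labels_per_level: list):
        super().__init__()
        self.heads = nn.ModuleList(nn.Linear(2 * dim, n) for n in labels_per_level)
        self.label_embs = nn.ParameterList(
            nn.Parameter(torch.randn(n, dim)) for n in labels_per_level)

    def forward(self, semantic: torch.Tensor):
        # semantic: (batch, dim) proposal representation from the text encoder.
        context = torch.zeros_like(semantic)   # interdisciplinary knowledge so far
        outputs = []
        for head, embs in zip(self.heads, self.label_embs):
            logits = head(torch.cat([semantic, context], dim=-1))
            probs = logits.sigmoid()           # multi-label scores at this level
            outputs.append(probs)
            # Carry the predicted disciplines' embeddings down to the next level.
            context = probs @ embs
        return outputs

model = LevelWisePredictor(128, [5, 20, 80])   # coarse-to-fine discipline levels
outs = model(torch.randn(4, 128))
print([o.shape for o in outs])   # [(4, 5), (4, 20), (4, 80)]
```

Feeding each level's predictions into the next is what turns independent classifiers into a topic *path* detector over the hierarchy.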
Feature transformation aims to extract a good representation (feature) space by mathematically transforming existing features. It is crucial for tackling the curse of dimensionality, enhancing model generalization, overcoming data sparsity, and expanding the applicability of classic models. Current research focuses on domain-knowledge-based feature engineering or on learning latent representations; however, these methods are not fully automated and cannot produce a traceable, optimal representation space. Can these limitations be addressed simultaneously when reconstructing a feature space for a machine learning task? In this extended study, we present a self-optimizing framework for feature transformation. To achieve better performance, we improve on our preliminary work by (1) obtaining an advanced state representation so that the reinforcement agents can better understand the current feature set, and (2) resolving the Q-value overestimation in Q-learning in order to learn unbiased and effective policies. Finally, to make the experiments more convincing than in the preliminary work, we add an outlier detection task with five datasets, evaluate various state representation methods, and compare different training strategies. Extensive experiments and case studies show that our approach is effective and superior.
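A standard remedy for Q-value overestimation is double Q-learning, where action selection and action evaluation use separate value estimates; the tabular sketch below illustrates the mechanism, though whether the paper adopts exactly this variant is our assumption.

```python
# Tabular double Q-learning sketch: decoupling selection from evaluation
# removes the maximization bias of standard Q-learning (assumed variant).
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma, lr = 10, 4, 0.9, 0.1
Q1 = np.zeros((n_states, n_actions))
Q2 = np.zeros((n_states, n_actions))

def update(s, a, r, s_next):
    if rng.random() < 0.5:
        a_star = Q1[s_next].argmax()   # select the action with Q1 ...
        Q1[s, a] += lr * (r + gamma * Q2[s_next, a_star] - Q1[s, a])  # ... evaluate with Q2
    else:
        a_star = Q2[s_next].argmax()
        Q2[s, a] += lr * (r + gamma * Q1[s_next, a_star] - Q2[s, a])

for _ in range(1000):  # toy random transitions in place of the feature-space MDP
    s, a = rng.integers(n_states), rng.integers(n_actions)
    update(s, a, rng.normal(), rng.integers(n_states))
print(Q1.mean(), Q2.mean())
```

Because `max` over noisy estimates is upward-biased, evaluating the selected action with the *other* table keeps the target unbiased, which is the property the abstract appeals to for learning effective policies.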
In this paper, we present our solutions to the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which comprises the MuSe-Humor, MuSe-Reaction, and MuSe-Stress sub-challenges. MuSe 2022 focuses on humor detection, emotional reactions, and multimodal emotional stress, utilizing different modalities and datasets. In our work, different kinds of multimodal features are extracted, including acoustic, visual, textual, and biological features. These features are fused by TEMMA and GRU within a self-attention-based framework. In this paper, 1) several new audio features, facial expression features, and paragraph-level text embeddings are extracted to improve accuracy; 2) we substantially improve the accuracy and reliability of multimodal sentiment prediction by mining and fusing the multimodal features; 3) effective data augmentation strategies are applied during model training to alleviate the sample imbalance problem and prevent the model from learning biased subject characteristics. For the MuSe-Humor sub-challenge, our model achieves an AUC score of 0.8932. For the MuSe-Reaction sub-challenge, the Pearson correlation coefficient of our approach on the test set is 0.3879, outperforming all other participants. For the MuSe-Stress sub-challenge, our method outperforms the baseline in both arousal and valence on the test dataset, reaching a final combined score of 0.5151.
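As a rough illustration of fusing per-modality sequences with a GRU and self-attention, consider the sketch below; the layer sizes are arbitrary, the modality names and dimensions are stand-ins, and the paper's TEMMA module is not reproduced here.

```python
# Sketch of multimodal sequence fusion with per-modality projections,
# a GRU, and self-attention (assumed layout; TEMMA itself not reproduced).
import torch
import torch.nn as nn

class FusionModel(nn.Module):
    def __init__(self, dims: dict, hidden: int = 128):
        super().__init__()
        # Project each modality into a shared hidden space.
        self.proj = nn.ModuleDict({m: nn.Linear(d, hidden) for m, d in dims.items()})
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, feats: dict):
        # Concatenate projected modality sequences along the time axis.
        x = torch.cat([self.proj[m](f) for m, f in feats.items()], dim=1)
        x, _ = self.gru(x)
        x, _ = self.attn(x, x, x)                    # self-attention over all steps
        return self.head(x.mean(dim=1)).squeeze(-1)  # one score per sample

feats = {"audio": torch.randn(2, 50, 40),
         "video": torch.randn(2, 50, 512),
         "text":  torch.randn(2, 30, 768)}
model = FusionModel({m: f.shape[-1] for m, f in feats.items()})
print(model(feats).shape)   # torch.Size([2])
```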